DETAILED ACTION

Response to Arguments 

1. Applicant's arguments filed 11/23/07, regarding claims 1, 2, 3, 6, 7, 10 - 14,
16 - 24, and 26 - 31, have been fully considered but they are not persuasive.

Applicant argues that Gillis does not teach computing a quantified representation 
of the semantic content of each document and comparing the quantified representations 
using a defined algorithm or metric (Amendment, page 14). 

The examiner disagrees. Gillis teaches identifying analogous structures in
semantically distant knowledge domains by creating abstract representations of content
(vectors) which are characteristic of a given domain of knowledge (source domain) and
searching for similar representations in semantically distant (target) domains (col.1,
lines 15 - 20). Creating such vectors and searching for similar representations in the
target domains amounts to computing a quantified representation of semantic content
and comparing those representations, since vectors are created in both the source and
target domains in order to identify analogous structures in both domains.

Applicant argues that Gillis does not teach a semantic vector that has at least: a 
word or phrase appearing in the document or a synonym of said word or phrase; a 
weighting factor associated with said word or phrase or synonym; and a frequency 
value (Amendment, page 15).

The examiner disagrees. Gillis teaches computing a set of term vectors (col.11,
lines 36 and 37). Computing term vectors implies a semantic vector that has a word or
phrase appearing in the document or a synonym of said word or phrase, since Gillis
defines terms as words or phrases.

2. Applicant's arguments (Amendment, page 14), filed 11/23/07, with respect to the
newly independent claims 33 - 35 have been fully considered and are persuasive.

Claim Rejections - 35 USC § 102 

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

4. Claims 1, 2, 3, 6, 7, 10 - 14, 16 - 24, and 26 - 31 are rejected under 35
U.S.C. 102(a) as being anticipated by Gillis (US Patent 6,523,026).

As per claim 1, Gillis teaches comparing semantic content of two or more
documents, comprising: 

accessing two or more documents ("source and target domains"); performing a
linguistic analysis on each document ("computing a set of vectors"; col. 10, lines 9-17;
col.11, lines 36-40);

outputting a semantic vector for each document, said semantic vector having
multiple components, wherein each component of said semantic vector has at least: a
word or phrase appearing in the document or a synonym of said word or phrase; a
weighting factor associated with said word or phrase or synonym; and a frequency
value ("computing a set of term vectors"; col.11, lines 36 and 37; col. 10, lines 18 - 20).

As per claim 2, Gillis further discloses that the linguistic analysis comprises 
sentence analysis ("sentence in the individual documents"; col.43, lines 43-46).

As per claim 3, Gillis further discloses that the sentence analysis comprises a 
syntactic analysis ("preferred stop list word include in the vectorization") and a semantic
analysis ("semantic similarity"; col. 39, lines 14 - 20; col. 35, lines 4 - 6). 

As per claim 6, Gillis further discloses that each component of the semantic 
vector can have multiple dimensions ("n dimensional space"; col.39, line 63 - col.40,
line 1).

As per claim 7, Gillis further discloses that each component of the semantic 
vector further comprises a subordinate concept value ("cable" is the subordinate 
concept of term "telecommunications"; col.51, lines 30 - 35).

As per claim 10, Gillis further discloses that some of the components of the
semantic vector have {main term - subordinate term pairs} as their first value ("cable"
and "telecommunications" are related term pairs, wherein cable is the subordinate term
of telecommunications; col.51, lines 30 - 35).

As per claim 11, Gillis further discloses that the semantic vector is a
multi-dimensional vector defined by the content of a semantic net ("n dimensional
semantic space"; col.39, line 63 - col.40, line 1).

As per claim 12, Gillis further discloses that the content of the semantic net is 
augmented by relative weights, strengths, or frequencies of occurrence of the features
within the semantic net ("frequency related weightings to terms in the computation of
summary vectors"; col.41, lines 40 - 46).

As per claim 13, Gillis further discloses that the output of said defined algorithm 
is a measure of at least one of semantic distance, semantic similarity, semantic 
dissimilarity, degree of patentable novelty and degree of anticipation ("semantic 
similarity"; col.4, lines 1 - 3). 

As per claims 14, 23, 24, 26, and 27, Gillis teaches comparing two or more
documents, by:

linguistically analyzing two or more documents ("computing a set of vectors";
col. 10, lines 9 - 17; col.11, lines 36 - 40);

generating a semantic vector associated with each document ("semantic vectors";
col. 39, lines 14-20); and 

comparing the semantic vectors using a defined metric ("summary vectors to be 
compared"; col.39, lines 19 and 20; col.42, lines 2 and 3);

wherein said metric measures the semantic distance between two documents as 
a function of the relative frequencies of common terms and of common {main term - 
subordinate term pairs} between the two documents ("semantically distant are 
individually represented at least 50 times"; col.48, lines 48 - 55; col.51, lines 30 - 35). 

As per claim 16, Gillis further discloses that a common term between two 
documents includes two terms that are synonyms (col.11, line 8).

As per claims 17 and 28, Gillis further discloses that one or more of said two or more
documents are located using an autonomous software or 'bot program ("software 
programs"; col. 10, lines 9-17; col. 25, lines 57 - 67). 

As per claims 18 and 29, Gillis further discloses automatically analyzing each
document in a defined domain (source and target domains) or network by executing a
series of rules and assigning an overall score to the document ("average of component
values"; col. 10, lines 9-17; col.41, line 66 - col.42, line 25).

As per claim 19, Gillis further discloses that all documents with a score above a
defined threshold are linguistically analyzed ("generate term vectors and accept only
records that match all the categories beyond some minimum threshold"; col.46, line 65
- col.47, line 11).

As per claims 20 and 30, Gillis further discloses that the semantic vector is a
quantification of the semantic content of each document ("semantic vectors"; col. 39, 
lines 14-20; col.1, lines 15-20). 

As per claim 21, Gillis further discloses that the semantic vector can have
multiple components, and each component can have multiple dimensions ("n 
dimensional semantic space"; col. 39, line 63 - col.40, line 1). 

As per claim 22, Gillis further discloses that each component of the semantic 
vector has a word or phrase appearing in the document or a synonym of said word or 
phrase (col. 11, line 8); 

a weighting factor associated with said word or phrase or synonym; and a
frequency value ("frequency related weightings to terms in the computation of summary
vectors"; col.41, lines 40 - 43).



Application/Control Number: 10/766,308 Page 8 

Art Unit: 2626 

As per claim 31, Gillis further discloses that the output of said defined algorithm
is a measure of at least one of semantic distance, semantic similarity, semantic 
dissimilarity, degree of patentable novelty and degree of anticipation ("semantic 
similarity"; col.4, lines 1 - 3). 

Allowable Subject Matter 

5. Claims 33 - 35 are allowed over the prior art. The following is an examiner's 
statement of reasons for allowance: 

As to claims 33 - 35, Gillis does not teach or suggest that the defined metric is
one of: Sqrt(f1^2 + f2^2 + f3^2 + f4^2 + ... + f(N-1)^2 + fN^2)/n*100, wherein f is a
difference in frequency of a common term between two documents and n is the number
of terms those documents have in common; or
Sqrt(sum((w-Delta)^2*w-Avg))/(Log(n)^3*1000), wherein w-Delta is the difference in
weight between two common terms, w-Avg is the average weight between two common
terms, and n is the number of common terms between two documents.
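
For illustration only, the two metrics recited above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names frequency_metric and weight_metric are hypothetical, N is assumed to equal n (one frequency difference per common term), w-Delta and w-Avg are assumed to be taken over the weights a common term carries in each of the two documents, and Log is assumed to be base 10 because the action does not state a base.

```python
import math


def frequency_metric(freq_diffs):
    """Sqrt(f1^2 + ... + fN^2) / n * 100.

    freq_diffs holds one frequency difference f per common term, so this
    sketch assumes N (the last index) equals n (the number of common terms).
    """
    n = len(freq_diffs)
    if n == 0:
        raise ValueError("at least one common term is required")
    return math.sqrt(sum(f * f for f in freq_diffs)) / n * 100


def weight_metric(weight_pairs):
    """Sqrt(sum((w-Delta)^2 * w-Avg)) / (Log(n)^3 * 1000).

    weight_pairs holds, for each common term, its weight in each of the two
    documents (an assumption). Log is assumed to be base 10.
    """
    n = len(weight_pairs)
    if n < 2:
        # Log(1) is 0, so the denominator vanishes for a single common term.
        raise ValueError("at least two common terms are required")
    total = sum((a - b) ** 2 * ((a + b) / 2) for a, b in weight_pairs)
    return math.sqrt(total) / (math.log10(n) ** 3 * 1000)


if __name__ == "__main__":
    # Two common terms whose frequencies differ by 3 and 4.
    print(frequency_metric([3, 4]))
    # Ten common terms; only the first differs in weight between documents.
    print(weight_metric([(2.0, 0.0)] + [(1.0, 1.0)] * 9))
```

Under these assumptions both metrics return zero when the two documents agree on every common term and increase as the frequencies or weights diverge.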

Conclusion 

6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Leonard Saint-Cyr whose telephone number is (571) 
272-4247. The examiner can normally be reached Monday - Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
(571) 273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 